Previous research (Cohen, 2015) has argued that Paradigmatic Enhancement results from competition between representations of contextually viable alternatives. According to Cohen, the paradigmatically related alternatives are stored as phonetically detailed exemplar representations. In non-deterministic contexts, exemplars of all alternatives are activated and influence the pronunciation of the produced form. In other words, the phonetic enhancement of paradigmatically supported forms reflects a lack of reduction due to interference from the pronunciations of the alternative forms. On this account, we would only expect to see paradigmatic enhancement if inflected representations actually play an important role during production. For Dutch plural inflections, the relative frequency of the plural and singular forms has been argued to reflect the degree to which Dutch plurals are composed (by rule or analogy) or represented. Our own model of variable plural distribution seems to support this claim:
```r
load("varPluralData.RData")
library(knitr)
library(aods3)

# Overdispersed binomial model of the distribution of the -s variant
distribution_model = aodml(cbind(f_s, f_nons) ~ p_s*log_freq_pl + p_s*prop_pl, var)
plot_aodml_effect(var, distribution_model, predictor_var = "p_s", moderator_var = "prop_pl",
                  constant_vars = c("log_freq_pl"), dependent_var = "prop_s",
                  predictor_lab = "Probability(-s)", moderator_lab = "Proportion(PL)",
                  dependent_lab = "Proportion(-s)", moderator_values = "0-0.5-1")
```
The plot above shows that, when the singular is more frequent than the plural (low Proportion(PL)), the proportion of the -s variant (Proportion(-s)) can be predicted from the phonological features of the singular (represented by Probability(-s)). However, when the plural is more frequent than the singular (high Proportion(PL)), phonological generalization performs poorly as a predictor. Presumably, this is because strong representations of the plural forms resist the phonological pressures.
Given these results, our hypothesis is that a higher Proportion(-s) should only result in a longer duration of -s when Proportion(PL) is high.
Below we model log(Duration(-s)) as a function of significant covariates and the interaction between Proportion(-s) and Proportion(PL).
```r
library(knitr)
library(lmerTest)

# Mixed-effects model of -s duration, with the critical
# Proportion(-s) x Proportion(PL) interaction
duration_model = lmer(log_s_dur ~ speech_rate_pron_sc +
                        PC1_sc + PC2_sc + PC3_sc +
                        next_phon_class +
                        register +
                        prop_s*prop_pl +
                        (1 | speaker) + (1 | word),
                      data = s_dur)

# Model criticism: refit after removing data points with
# standardized residuals beyond 2.5 SD
s_dur$dur_resid = resid(duration_model)
s_dur_trim = s_dur[abs(scale(s_dur$dur_resid)) < 2.5,]
duration_model_trim = lmer(log_s_dur ~ speech_rate_pron_sc +
                             PC1_sc + PC2_sc + PC3_sc +
                             next_phon_class +
                             register +
                             prop_s*prop_pl +
                             (1 | speaker) + (1 | word),
                           data = s_dur_trim)
```
```r
library(sjPlot)
plot_model(duration_model_trim, type = "eff",
           terms = c("prop_s", "prop_pl[0, 0.5, 1]"),
           colors = "bw", legend.title = "Proportion(PL)", title = "",
           axis.title = c("Proportion(-s)", "log(duration(-s))"))
```
The plot above shows that we do find paradigmatic enhancement but only if Proportion(PL) is high. This is in line with our hypothesis. However, the plot also seems to show a reduction effect when Proportion(PL) is low.
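The model criticism procedure used above, refitting the model after removing data points whose standardized residuals exceed 2.5 standard deviations, is a generic technique. A minimal Python sketch on simulated data (illustrative only; the analysis itself uses `resid()` and `scale()` in R):

```python
import numpy as np

def trim_by_residuals(y, y_hat, threshold=2.5):
    """Keep observations whose standardized residuals lie within
    +/- threshold standard deviations of the mean residual."""
    resid = y - y_hat
    z = (resid - resid.mean()) / resid.std()
    return np.abs(z) < threshold

# Simulated fitted values and observations
rng = np.random.default_rng(1)
y_hat = rng.normal(size=1000)
y = y_hat + rng.normal(scale=0.5, size=1000)

keep = trim_by_residuals(y, y_hat)
# With roughly Gaussian residuals, about 98-99% of the data survives a 2.5 SD cut
print(keep.mean())
```

The trimmed model is then refit on the surviving observations only, just as duration_model_trim is refit on s_dur_trim.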
Let’s find out whether either the apparent paradigmatic enhancement or the apparent reduction is due to collinear predictors or to non-linear relations between the variables.
First, let’s check whether either effect is merely apparent by testing whether quadratic predictors improve the model:
```r
# Same model, but with orthogonal quadratic terms for both predictors
duration_model_quad = lmer(log_s_dur ~ speech_rate_pron_sc +
                             PC1_sc + PC2_sc + PC3_sc +
                             next_phon_class +
                             register +
                             poly(prop_s,2)*poly(prop_pl,2) +
                             (1 | speaker) + (1 | word),
                           data = s_dur)
kable(as.matrix(summary(duration_model_quad)$coefficients), caption = "Coefficients")
```
|  | Estimate | Std. Error | df | t value | Pr(>\|t\|) |
|---|---|---|---|---|---|
| (Intercept) | -2.6698144 | 0.0347143 | 193.70893 | -76.9081670 | 0.0000000 |
| speech_rate_pron_sc | -0.1063218 | 0.0169115 | 549.39564 | -6.2869589 | 0.0000000 |
| PC1_sc | 0.0371866 | 0.0152224 | 572.24011 | 2.4428905 | 0.0148715 |
| PC2_sc | 0.0351433 | 0.0148204 | 570.07292 | 2.3712744 | 0.0180582 |
| PC3_sc | 0.0322952 | 0.0150026 | 577.43301 | 2.1526316 | 0.0317613 |
| next_phon_classAPP | -0.1672756 | 0.0803680 | 575.32980 | -2.0813697 | 0.0378417 |
| next_phon_classF | 0.0229725 | 0.0463662 | 572.72214 | 0.4954568 | 0.6204678 |
| next_phon_classL | -0.0910898 | 0.1373295 | 570.64801 | -0.6632937 | 0.5074103 |
| next_phon_classN | -0.0633337 | 0.0780012 | 571.67854 | -0.8119579 | 0.4171538 |
| next_phon_classP | -0.0562494 | 0.0643604 | 575.02132 | -0.8739758 | 0.3824963 |
| next_phon_classSIL | 0.5136385 | 0.0387770 | 575.24111 | 13.2459479 | 0.0000000 |
| registerstories | 0.1307360 | 0.0374479 | 299.25033 | 3.4911438 | 0.0005534 |
| registernews | -0.0236922 | 0.0659327 | 77.05438 | -0.3593391 | 0.7203243 |
| poly(prop_s, 2)1 | -0.5065052 | 0.4736805 | 53.89692 | -1.0692973 | 0.2897014 |
| poly(prop_s, 2)2 | 0.2306023 | 0.4831137 | 84.63228 | 0.4773252 | 0.6343610 |
| poly(prop_pl, 2)1 | -0.1750697 | 0.5185962 | 34.89452 | -0.3375838 | 0.7376992 |
| poly(prop_pl, 2)2 | -0.0644631 | 0.5155372 | 38.97858 | -0.1250407 | 0.9011340 |
| poly(prop_s, 2)1:poly(prop_pl, 2)1 | 29.5196937 | 12.3277534 | 41.48201 | 2.3945721 | 0.0212382 |
| poly(prop_s, 2)2:poly(prop_pl, 2)1 | 13.3373930 | 13.6540548 | 104.49544 | 0.9768082 | 0.3309201 |
| poly(prop_s, 2)1:poly(prop_pl, 2)2 | -0.5241158 | 12.7705669 | 47.35373 | -0.0410409 | 0.9674358 |
| poly(prop_s, 2)2:poly(prop_pl, 2)2 | 7.1908778 | 13.3990392 | 71.17055 | 0.5366711 | 0.5931685 |
As you can see, none of the quadratic terms improve the model. Only poly(prop_s, 2)1:poly(prop_pl, 2)1, the interaction in which both predictors are linear, is significant.
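A note on poly(x, 2): R builds orthogonal polynomial contrasts, so the linear and quadratic terms are uncorrelated by construction and can be assessed separately in the table above. A rough Python sketch of the same construction, via QR decomposition of the raw polynomial basis (this mirrors what poly() does, up to column sign and scaling):

```python
import numpy as np

def ortho_poly(x, degree=2):
    """Orthogonal polynomial basis for x (columns: degree 1..degree),
    obtained by QR-decomposing the raw basis [1, x, x^2, ...]."""
    x = np.asarray(x, dtype=float)
    raw = np.vander(x, degree + 1, increasing=True)
    q, _ = np.linalg.qr(raw)
    return q[:, 1:]  # drop the constant column

x = np.linspace(0, 1, 50)
basis = ortho_poly(x, 2)
# The linear and quadratic columns are orthogonal (dot product ~ 0)
print(np.dot(basis[:, 0], basis[:, 1]))
```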
Now, let’s check the associations between all variables in the model:
```r
library(rcompanion)
library(corrplot)

# Pairwise associations: bias-corrected Cramer's V for pairs of categorical
# variables, sqrt(R^2) for categorical-numeric pairs, Pearson's r otherwise
vars = c("next_phon_class", "register", "speech_rate_pron_sc", "PC1_sc",
         "PC2_sc", "PC3_sc", "prop_s", "prop_pl", "log_s_dur")
var_labs = c("Next Phonetic Class", "Register", "Speech Rate", "Prosody 1",
             "Prosody 2", "Prosody 3", "Proportion(-s)", "Proportion(PL)",
             "log(duration(-s))")
cat_vars = c("next_phon_class", "register")
pred_ass = matrix(NA, nrow = length(vars), ncol = length(vars),
                  dimnames = list(var_labs, var_labs))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    x = s_dur[[vars[i]]]
    y = s_dur[[vars[j]]]
    if (vars[i] %in% cat_vars & vars[j] %in% cat_vars) {
      pred_ass[i, j] = cramerV(table(x, y), bias.correct = TRUE)
    } else if (vars[i] %in% cat_vars) {
      pred_ass[i, j] = sqrt(summary(lm(y ~ x))$r.squared)
    } else if (vars[j] %in% cat_vars) {
      pred_ass[i, j] = sqrt(summary(lm(x ~ y))$r.squared)
    } else {
      pred_ass[i, j] = cor(x, y)
    }
  }
}
corrplot(pred_ass, method = "number")
```
As you can see, the predictors of interest and the covariates are only weakly correlated. However, some of the covariates are rather strongly associated with our dependent variable. To get a better idea of which data points support the interaction between Proportion(-s) and Proportion(PL), let’s first residualize the durations on the covariates and then inspect the interaction effect.
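For reference, the bias-corrected Cramer's V used for the categorical pairs in the matrix above can be sketched as follows. This is an illustrative Python version of the correction (following Bergsma, 2013), not the rcompanion implementation itself:

```python
import numpy as np

def cramers_v_corrected(table):
    """Bias-corrected Cramer's V (Bergsma, 2013) for a contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    expected = table.sum(axis=1, keepdims=True) @ table.sum(axis=0, keepdims=True) / n
    phi2 = ((table - expected) ** 2 / expected).sum() / n  # chi-squared / n
    r, k = table.shape
    # Bias-correct phi-squared and the table dimensions
    phi2_c = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_c = r - (r - 1) ** 2 / (n - 1)
    k_c = k - (k - 1) ** 2 / (n - 1)
    return float(np.sqrt(phi2_c / min(r_c - 1, k_c - 1)))

# A perfectly associated table yields 1; a strongly associated one a high value
print(cramers_v_corrected([[50, 0], [0, 50]]))
print(cramers_v_corrected([[40, 5], [4, 41]]))
```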
```r
# Residualize duration on the covariates only
duration_model_cov = lmer(log_s_dur ~ speech_rate_pron_sc +
                            PC1_sc + PC2_sc + PC3_sc +
                            next_phon_class +
                            register +
                            (1 | speaker) + (1 | word),
                          data = s_dur)
s_dur$resid_dur = resid(duration_model_cov)
s_dur$prop_pl_groups = factor(cut(s_dur$prop_pl, breaks = 3),
                              labels = c("small", "average", "large"))

library(ggplot2)
ggplot(s_dur, aes(x = prop_s, y = resid_dur, color = prop_pl_groups)) +
  geom_point(size = .9, alpha = .3) +
  geom_smooth(method = "lm", se = F) +
  theme_bw() +
  labs(x = "Proportion(-s)", y = "residual(duration(-s))", color = "Proportion(PL)") +
  ylim(-0.5, 0.5)
```
The plot above makes clear that the residuals still contain a lot of variance that our interaction of interest does not explain. As a result, visual inspection alone cannot tell us whether the reduction or the enhancement effect lacks support in the data. Using the interactions package, we can explore this more formally by computing the Johnson-Neyman interval.
```r
library(interactions)
jn = johnson_neyman(duration_model_trim, pred = prop_s, modx = prop_pl, plot = T)
jn$bounds
##     Lower    Higher
## 0.3115674 0.8684140
jn$plot + xlab("Proportion(PL)") + ylab("Slope of Proportion(-s)")
```
This tells us that the effect of Proportion(-s) is significant if Proportion(PL) is either below 0.31 or above 0.87. In other words, the significant interaction reflects both a reduction and an enhancement effect. So what is the explanation for the reduction effect?
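Conceptually, the Johnson-Neyman procedure solves for the moderator values at which the simple slope b1 + b3·modx sits exactly at the significance boundary: (b1 + b3·m)² = t²·Var(b1 + b3·m), a quadratic in m. A Python sketch with made-up coefficient values (the numbers below are illustrative, not taken from the fitted model):

```python
import numpy as np

def jn_bounds(b1, b3, var_b1, var_b3, cov_b1b3, t_crit=1.96):
    """Johnson-Neyman bounds: moderator values m where the simple slope
    b1 + b3*m has exactly |t| = t_crit. Var(b1 + b3*m) expands to
    var_b1 + 2*m*cov_b1b3 + m^2*var_b3, giving a quadratic in m."""
    a = b3**2 - t_crit**2 * var_b3
    b = 2 * (b1 * b3 - t_crit**2 * cov_b1b3)
    c = b1**2 - t_crit**2 * var_b1
    disc = b**2 - 4 * a * c
    if disc < 0:
        return None  # the slope never crosses the significance boundary
    return tuple(np.sort(np.roots([a, b, c]).real))

# Illustrative values: a slope that is negative at low moderator values
# and positive at high ones, as in the duration model
print(jn_bounds(b1=-0.2, b3=0.5, var_b1=0.004, var_b3=0.02, cov_b1b3=-0.006))
```

With these toy numbers the slope is reliably negative below the lower bound and reliably positive above the upper one, matching the pattern johnson_neyman() reports for the real model.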
First of all, we should remember from our distributional study that what Proportion(-s) represents depends on the value of Proportion(PL). At high Proportion(PL), it may be a measure of paradigmatic competition, but at low Proportion(PL), it may represent the amount of phonological support from similar paradigms. Could phonological support result in phonetic reduction? Previous research on phonological neighbourhood size suggests that it might (Gahl, Yao, & Johnson, 2012). Is there a way to investigate whether the durational reduction in our data is due to phonological support? We cannot include Probability(-s) and Proportion(-s) in the same model, as the two measures are strongly correlated. We can, however, include Probability(-s) together with the residuals of Proportion(-s) from the distributional model. If we assume that the reduction effect is due to increased phonological support, this makes a number of predictions.
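This residualization move is a general recipe for including two collinear predictors: regress one on the other (here, Proportion(-s) on the distributional model's predictions) and let the residuals enter the model instead. A generic Python sketch with toy variables (x1 and x2 are hypothetical stand-ins, and the sketch uses a simple linear fit rather than the distributional model):

```python
import numpy as np

rng = np.random.default_rng(42)

# Two strongly correlated predictors: x2 is largely driven by x1
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=500)

# Residualize x2 on x1 with an ordinary least-squares fit
slope, intercept = np.polyfit(x1, x2, 1)
x2_resid = x2 - (intercept + slope * x1)

# x2_resid is (numerically) uncorrelated with x1, so both can
# enter the same regression without collinearity problems
print(np.corrcoef(x1, x2)[0, 1], np.corrcoef(x1, x2_resid)[0, 1])
```

The analysis below defines resid_prop_s as prediction minus observation; the sign convention does not affect the collinearity argument.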
Let’s see if these predictions are borne out:
```r
# Residualize Proportion(-s) against the distributional model's predictions
s_dur$pred_prop_s = plogis(predict(distribution_model, newdata = s_dur))
s_dur$resid_prop_s = s_dur$pred_prop_s - s_dur$prop_s

duration_model2 = lmer(log_s_dur ~ speech_rate_pron_sc +
                         PC1_sc + PC2_sc + PC3_sc +
                         next_phon_class +
                         register +
                         p_s +
                         resid_prop_s*prop_pl +
                         (1 | speaker) + (1 | word),
                       data = s_dur)

# Model criticism, as before
s_dur$dur_resid = resid(duration_model2)
s_dur_trim = s_dur[abs(scale(s_dur$dur_resid)) < 2.5,]
duration_model2_trim = lmer(log_s_dur ~ speech_rate_pron_sc +
                              PC1_sc + PC2_sc + PC3_sc +
                              next_phon_class +
                              register +
                              p_s +
                              resid_prop_s*prop_pl +
                              (1 | speaker) + (1 | word),
                            data = s_dur_trim)
kable(as.matrix(summary(duration_model2_trim)$coefficients), caption = "Coefficients")
```
|  | Estimate | Std. Error | df | t value | Pr(>\|t\|) |
|---|---|---|---|---|---|
| (Intercept) | -2.5279653 | 0.0593808 | 179.79914 | -42.5721253 | 0.0000000 |
| speech_rate_pron_sc | -0.1138740 | 0.0151382 | 533.21773 | -7.5222865 | 0.0000000 |
| PC1_sc | 0.0423187 | 0.0137060 | 558.62030 | 3.0875930 | 0.0021180 |
| PC2_sc | 0.0372968 | 0.0132933 | 568.32714 | 2.8056772 | 0.0051931 |
| PC3_sc | 0.0394431 | 0.0134518 | 568.46434 | 2.9321790 | 0.0035014 |
| next_phon_classAPP | -0.1548083 | 0.0713879 | 568.48830 | -2.1685528 | 0.0305306 |
| next_phon_classF | 0.0443955 | 0.0411668 | 565.50418 | 1.0784302 | 0.2813016 |
| next_phon_classL | -0.0959535 | 0.1229804 | 562.91140 | -0.7802340 | 0.4355811 |
| next_phon_classN | -0.0597803 | 0.0697033 | 564.95643 | -0.8576396 | 0.3914552 |
| next_phon_classP | -0.0605670 | 0.0582319 | 569.07741 | -1.0401002 | 0.2987350 |
| next_phon_classSIL | 0.5514603 | 0.0349728 | 568.36772 | 15.7682459 | 0.0000000 |
| registerstories | 0.1209972 | 0.0329155 | 292.18502 | 3.6759980 | 0.0002819 |
| registernews | -0.0411585 | 0.0555990 | 66.91342 | -0.7402744 | 0.4617231 |
| p_s | -0.1096361 | 0.0537727 | 100.11135 | -2.0388804 | 0.0440976 |
| resid_prop_s | 0.3907298 | 0.1195568 | 63.12994 | 3.2681528 | 0.0017537 |
| prop_pl | -0.1790010 | 0.0785717 | 53.42592 | -2.2781866 | 0.0267407 |
| resid_prop_s:prop_pl | -0.9052802 | 0.2461469 | 47.47221 | -3.6778047 | 0.0005989 |
The negative coefficient for p_s shows us that increased Probability(-s) does indeed have a reduction effect. Now, let’s explore the interaction between Resid(Proportion(-s)) and Proportion(PL):
```r
plot_model(duration_model2_trim, type = "eff",
           terms = c("resid_prop_s", "prop_pl[0, 0.5, 1]"),
           colors = "bw", legend.title = "Proportion(PL)", title = "",
           axis.title = c("Resid(Proportion(-s))", "log(duration(-s))"))
```
The cross-over interaction we see here is consistent with an account in which the residuals of the distributional model represent different aspects of variable plural production, depending on the value of Proportion(PL). At low Proportion(PL), the residuals probably represent errors in the phonological predictions (captured by the Probability(-s) variable). At high Proportion(PL), the residuals represent unexplained variance in Proportion(-s) that arises because the variation itself is stored.